{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "LjutQyZ1ZrXc"
      },
      "source": [
        "##### Copyright 2018 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "3DsECcj8ZhLy"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "jo5PziEC4hWs"
      },
      "source": [
        "# Neural Style Transfer With Eager Execution \u0026 Keras\n",
        "\n",
        "_Notebook originally contributed by: [github.com/raskutti](github.com/raskutti)_\n",
        "\n",
        "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/raskutti/examples/blob/master/community/en/neural_style_transfer/neural_style_transfer.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://github.com/raskutti/examples/blob/master/community/en/neural_style_transfer/neural_style_transfer.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "\u003c/table\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "aDyGj8DmXCJI"
      },
      "source": [
        "## Overview\n",
        "\n",
        "This notebook demonstrates *neutral style transfer*, a technique used to display an image in the style of a different image. This algorithm is outlined in more detail in [this paper](https://arxiv.org/abs/1508.06576). The material here is heavily based on the awesome work in [this article](https://medium.com/tensorflow/neural-style-transfer-creating-art-with-deep-learning-using-tf-keras-and-eager-execution-7d541ac31398) by [Raymond Yuan](https://github.com/raymond-yuan) and [this notebook](https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/8.3-neural-style-transfer.ipynb) by [Francois Chollet](https://github.com/fchollet).\n",
        "\n",
        "Neural style transfer is an optimization technique that can be applied to produce a *generated image* that conveys the content of a *content image* through the style of a *style image*. Content images are generally object-specific, for example a portrait, while style images are generally background, for example scenery.  \n",
        "\n",
        "As with all deep learning algorithms, neural style transfer defines a loss function then minimizes it. Suppose we have a function $C$ to measure content and a function $S$ to measure style, as well as measures of distance between of two images $x$ and $y$ for content and style, denoted by $L_c(x, y)$ and $L_s(x, y)$ respectively. Then the loss function $L$ is as below, where $k$ is the content image, $m$ is the style image, and $n$ is the generated output image (the variable to minimize over).\n",
        "\n",
        "$$ L(n) = L_c(C(k), C(n)) + L_s(S(m), S(n)) $$\n",
        "\n",
        "The example below takes a graffiti drawing of Eminem as the content image and a [Julia Set fractal](https://en.wikipedia.org/wiki/Julia_set) as the style image. The generated image conveys the same work of Eminem through the style of the fractal. \n",
        "\n",
        "\u003cimg src='https://github.com/raskutti/examples/blob/master/community/en/neural_style_transfer/eminem_fractal.png?raw=1' alt='Drawing' style='width: 200px;'/\u003e\n",
        "Original image by [geishaboy500](\u003chttps://commons.wikimedia.org/wiki/File:Southsea_Skatepark_Graff_%285%29_%283874814819%29.jpg\u003e)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "U8ajP_u73s6m"
      },
      "source": [
        "## Download Images"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "KNH2ARiq2Ede"
      },
      "outputs": [],
      "source": [
        "from __future__ import absolute_import, division, print_function, unicode_literals\n",
        "\n",
        "%tensorflow_version 2.x\n",
        "import tensorflow as tf"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "K5PcXAjolMMZ"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import matplotlib.pyplot as plt"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "1jXmDacRCqA9"
      },
      "outputs": [],
      "source": [
        "from tensorflow.keras.preprocessing.image import load_img, img_to_array"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "6LAITUfbP7cJ"
      },
      "outputs": [],
      "source": [
        "!wget --quiet -O 'eminem.jpg' https://upload.wikimedia.org/wikipedia/commons/f/f1/Southsea_Skatepark_Graff_%287%29_%283874828505%29.jpg\n",
        "!wget --quiet -O 'fractal.jpg' https://upload.wikimedia.org/wikipedia/commons/1/17/Julia_set_%28highres_01%29.jpg\n",
        "\n",
        "!ls"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "2yAlRzJZrWM3"
      },
      "source": [
        "We can now display the content image and the style image side by side."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "hc8hz2O9VHD6"
      },
      "outputs": [],
      "source": [
        "plt.figure(figsize = (12, 6))\n",
        "\n",
        "plt.subplot(1, 2, 1)\n",
        "plt.imshow(load_img('eminem.jpg'))\n",
        "plt.axis('off')\n",
        "\n",
        "plt.subplot(1, 2, 2)\n",
        "plt.imshow(load_img('fractal.jpg'))\n",
        "plt.axis('off')\n",
        "\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "7qMVNvEsK-_D"
      },
      "source": [
        "## Processing Images"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "rui4IAgKLJRC"
      },
      "outputs": [],
      "source": [
        "from tensorflow.keras.applications import vgg19"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "B2cWfCwHLKsv"
      },
      "source": [
        "Let's create methods that will allow us to load and preprocess our images easily. We perform the same preprocessing process as are expected according to the VGG training process. VGG networks are trained on image with each channel normalized by `mean = [103.939, 116.779, 123.68]`and with channels BGR."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "gcO8rhYLk2Ki"
      },
      "outputs": [],
      "source": [
        "def preprocess_img(img_path):\n",
        "  \n",
        "  # Set the proportions of the image.\n",
        "  \n",
        "  width, height = load_img(img_path).size\n",
        "  img_height = 500\n",
        "  img_width = int(width * img_height / height)\n",
        "  \n",
        "  img = load_img(img_path, target_size=(img_height, img_width))\n",
        "  img = img_to_array(img)\n",
        "  img = np.expand_dims(img, axis=0)\n",
        "  img = vgg19.preprocess_input(img)\n",
        "  \n",
        "  return img"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "xCgooqs6tAka"
      },
      "source": [
        "In order to view the outputs of our optimization, we are required to perform the inverse preprocessing step. Furthermore, since our optimized image may take its values anywhere between $- \\infty$ and $\\infty$, we must clip to maintain our values from within the 0-255 range.   "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "mjzlKRQRs_y2"
      },
      "outputs": [],
      "source": [
        "def deprocess_img(processed_img):\n",
        "\n",
        "  x = processed_img.copy()\n",
        "  \n",
        "  if len(x.shape) == 4:\n",
        "    x = np.squeeze(x, 0)\n",
        "\n",
        "#   if len(x.shape) != 3:\n",
        "#     raise ValueError('Invalid input to deprocessing image')\n",
        "  \n",
        "  # Perform the inverse of the preprocessing step.\n",
        "  x[:, :, 0] += 103.939\n",
        "  x[:, :, 1] += 116.779\n",
        "  x[:, :, 2] += 123.68\n",
        "  x = x[:, :, ::-1]\n",
        "\n",
        "  x = np.clip(x, 0, 255).astype('uint8')\n",
        "  return x"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "GEwZ7FlwrjoZ"
      },
      "source": [
        "## Define Content \u0026 Style Layers\n",
        "\n",
        "In order to get both the content and style representations of our image, we will look at some intermediate layers within our model. As we go deeper into the model, these intermediate layers represent higher and higher order features. In this case, we are using the network architecture [VGG19](https://keras.io/applications/#vgg19), a pretrained image classification network. These intermediate layers are necessary to define the representation of content and style from our images. For an input image, we will try to match the corresponding style and content target representations at these intermediate layers. \n",
        "\n",
        "So, why do these intermediate outputs within our pretrained image classification network allow us to define style and content representations? At a high level, this phenomenon can be explained by the fact that in order for a network to perform image classification (which our network has been trained to do), it must understand the image. This involves taking the raw image as input pixels and building an internal representation via functions that turn the raw image pixels into an understanding of the features present within the image.\n",
        "\n",
        "This is also partly why convolutional neural networks are able to generalize well. They are able to capture the invariances and defining features within classes (e.g. cats vs dogs) that are agnostic to background noise and other nuisances. Therefore, somewhere between where the raw image is input and the classification label is output, the model serves as a complex feature extractor. So by accessing intermediate layers, we’re able to describe the content and style of input images. \n",
        "\n",
        "\n",
        "We’ll pull the following intermediate layers from our network."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "N4-8eUp_Kc-j"
      },
      "outputs": [],
      "source": [
        "content_layers = [\n",
        "    'block5_conv2',\n",
        "]\n",
        "\n",
        "num_content_layers = len(content_layers)\n",
        "\n",
        "style_layers = [\n",
        "    'block1_conv1',\n",
        "    'block2_conv1',\n",
        "    'block3_conv1', \n",
        "    'block4_conv1', \n",
        "    'block5_conv1',\n",
        "]\n",
        "\n",
        "num_style_layers = len(style_layers)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Jt3i3RRrJiOX"
      },
      "source": [
        "## Build The Model "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "yc3j6OFOMQ88"
      },
      "outputs": [],
      "source": [
        "from tensorflow.python.keras import models\n",
        "import tensorflow.contrib.eager as tfe"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Vn1e4RqwMXQv"
      },
      "source": [
        "VGG19 is a relatively simple model (compared with ResNet, Inception, etc.), and the feature maps tend to work better for style transfer. We feed in our input tensor to the model, then extract the feature maps (and subsequently the content and style representations) of the content, style, and generated images.\n",
        "\n",
        "In order to access the intermediate layers corresponding to our style and content feature maps, we use the [Keras functional API**](https://keras.io/getting-started/functional-api-guide/). With this API, defining a model simply involves defining the input and output, i.e. `model = Model(inputs, outputs)`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "nfec6MuMAbPx"
      },
      "outputs": [],
      "source": [
        "def get_model():\n",
        "  \"\"\"Creates a model with access to intermediate layers. \n",
        "  \n",
        "  These layers will then be used to create a new model that will take the\n",
        "  content image and return the outputs from these intermediate layers from the\n",
        "  VGG model. \n",
        "  \n",
        "  Returns:\n",
        "    A keras model that takes image inputs and outputs the style and content\n",
        "    intermediate layers.\n",
        "  \"\"\"\n",
        "\n",
        "  vgg = vgg19.VGG19(include_top=False, weights='imagenet')\n",
        "  vgg.trainable = False\n",
        " \n",
        "  style_outputs = [vgg.get_layer(name).output for name in style_layers]\n",
        "  content_outputs = [vgg.get_layer(name).output for name in content_layers]\n",
        "  model_outputs = style_outputs + content_outputs\n",
        "\n",
        "  return models.Model(vgg.input, model_outputs)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "vJdYvJTZ4bdS"
      },
      "source": [
        "## Define Content \u0026 Style Loss Functions"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "1FvH-gwXi4nq"
      },
      "source": [
        "### Content Loss\n",
        "\n",
        "The function that defines content loss will take both the desired content image and our base input image. These images are passed to the network, and will return the intermediate layer outputs from our model. Then, the loss simply takes the Euclidean distance between the two intermediate representations of those images.  \n",
        "\n",
        "More formally, let $N$ be a pre-trained deep convolutional neural network. Let $X$ be any image, then $N(X)$ is the network fed by $X$. Let $A^l(k) \\in N(k)$ and $B^l(n) \\in N(n)$ describe the respective intermediate feature representation of the network with inputs $k$ and $n$ at layer $l$. Then we describe the content loss $L_c^l$ formally as below.\n",
        "\n",
        "$$L_c^l(k, n) = \\sum_N (A^l(k) - B^l(n))^2$$\n",
        "\n",
        "We can use backpropogration to minimize content loss, thus changing the initial image until it generates a similar response in a given layer as the original content image.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "d2mf7JwRMkCd"
      },
      "outputs": [],
      "source": [
        "def compute_content_loss(base_content, target):\n",
        "  return tf.reduce_mean(tf.square(base_content - target))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "lGUfttK9F8d5"
      },
      "source": [
        "### Style Loss"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "I6XtkGK_YGD1"
      },
      "source": [
        "Computing style loss is a bit more involved, but follows the same principle, this time feeding our network the base input image and the style image. However, instead of comparing the raw intermediate outputs of the base input image and the style image, we instead compare the [Gram matrices](https://en.wikipedia.org/wiki/Gramian_matrix) of the two outputs. \n",
        "\n",
        "Mathematically, we describe the style representation of an image as the correlation between different filter responses given by the Gram matrix  $G^l$, where $G^l_{ij}$ is the inner product (and  represents the correlation) between the vectorized feature map $i$ and $j$ in layer $l$. \n",
        "\n",
        "To generate a style for our base input image, we perform [gradient descent](https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent) from the content image to transform it into an image that matches the style representation of the original image. We do so by minimizing the mean squared distance between the feature correlation map of the style image and the input image. The contribution $E_l$ of each layer $l$ to the total style loss is described by\n",
        "\n",
        "$$E_l(m, n) = \\frac{1}{4C_l^2D_l^2} \\sum_{i,j}(G^l_{ij}(m) - G^l_{ij}(n))^2$$\n",
        "\n",
        "where $C_l$ is the number of feature maps, each of size $D_l = \\textrm{height} \\cdot \\textrm{width}$. \n",
        "\n",
        "Thus, the total style loss $L_s$ across each layer $l$ is \n",
        "\n",
        "$$L_s(m, n) = \\sum_l w_l E_l(m, n)$$\n",
        "\n",
        "where we weight the contribution of each layer's loss by some factor $w_l$. In our case, we weight each layer equally, so $w_l = w \\ \\forall \\ l$."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "N7MOqwKLLke8"
      },
      "outputs": [],
      "source": [
        "def gram_matrix(input_tensor):\n",
        "\n",
        "  channels = int(input_tensor.shape[-1])\n",
        "  a = tf.reshape(input_tensor, [-1, channels])\n",
        "  n = tf.shape(a)[0]\n",
        "  gram = tf.matmul(a, a, transpose_a=True)\n",
        "  return gram / tf.cast(n, tf.float32)\n",
        "\n",
        "def compute_style_loss(base_style, gram_target):\n",
        "\n",
        "  height, width, channels = base_style.get_shape().as_list()\n",
        "  gram_style = gram_matrix(base_style)\n",
        "  \n",
        "  return tf.reduce_mean(tf.square(gram_style - gram_target))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "y9r8Lyjb_m0u"
      },
      "source": [
        "## Apply Neural Style Transfer\n",
        "\n",
        "We use the [Adam](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam) optimizer in order to minimize our loss. We iteratively update our output image such that it minimizes our loss. In order to do this, we must know how we calculate our loss and gradients. The [L-BFGS](https://en.wikipedia.org/wiki/Limited-memory_BFGS) optimization algorithm is recommended but is not used in this tutorial, as using Adam allows us to demonstrate the autograd/gradient tape functionality with custom training loops, as per eager best practices.\n",
        "\n",
        "We’ll define a little helper function that will load our content and style image, feed them forward through our network, which will then output the content and style feature representations from our model. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "O-lj5LxgtmnI"
      },
      "outputs": [],
      "source": [
        "def feature_representations(model, content_path, style_path):\n",
        "  \"\"\"Helper function to compute our content and style feature representations.\n",
        "\n",
        "  This function will simply load and preprocess both the content and style \n",
        "  images from their path. Then it will feed them through the network to obtain\n",
        "  the outputs of the intermediate layers. \n",
        "  \n",
        "  Arguments:\n",
        "    model: the model that we are using\n",
        "    content_path: the path to the content image\n",
        "    style_path: the path to the style image\n",
        "    \n",
        "  Returns:\n",
        "    The style and content features.\n",
        "  \"\"\"\n",
        "\n",
        "  content_img = preprocess_img(content_path)\n",
        "  style_img = preprocess_img(style_path)\n",
        "\n",
        "  style_outputs = model(style_img)\n",
        "  content_outputs = model(content_img)\n",
        "  \n",
        "  \n",
        "  style_features = [\n",
        "      style_layer[0] for style_layer in style_outputs[:num_style_layers]]\n",
        "  content_features = [\n",
        "      content_layer[0] for content_layer in content_outputs[num_style_layers:]]\n",
        "\n",
        "  return style_features, content_features"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "3DopXw7-lFHa"
      },
      "source": [
        "### Computing Loss \u0026 Gradients\n",
        "\n",
        "Here we use [tf.GradientTape](https://www.tensorflow.org/programmers_guide/eager#computing_gradients) to compute the gradient. It allows us to take advantage of the automatic differentiation available by tracing operations for computing the gradient later. It records the operations during the forward pass and then is able to compute the gradient of our loss function with respect to our input image for the backwards pass."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "oVDhSo8iJunf"
      },
      "outputs": [],
      "source": [
        "def compute_loss(\n",
        "    model, loss_weights, init_img, gram_style_features, content_features):\n",
        "  \"\"\"Computes the total loss.\n",
        "  \n",
        "  Arguments:\n",
        "    model: the model that will give us access to the intermediate layers\n",
        "    loss_weights: the weights of each contribution of each loss function\n",
        "      (style weight, content weight, and total variation weight)\n",
        "    init_img: the initial base image, that is updated according to the\n",
        "      optimization process\n",
        "    gram_style_features: precomputed gram matrices corresponding to the \n",
        "      defined style layers of interest\n",
        "    content_features: precomputed outputs from defined content layers of\n",
        "      interest\n",
        "      \n",
        "  Returns:\n",
        "    The total loss, style loss, content loss, and total variational loss.\n",
        "  \"\"\"\n",
        "  style_weight, content_weight = loss_weights\n",
        "  \n",
        "  # Feed our init image through our model. This will give us the content and \n",
        "  # style representations at our desired layers.\n",
        "  model_outputs = model(init_img)\n",
        "  \n",
        "  style_output_features = model_outputs[:num_style_layers]\n",
        "  content_output_features = model_outputs[num_style_layers:]\n",
        "  \n",
        "  style_loss, content_loss = 0, 0\n",
        "\n",
        "  # Accumulate style losses from all layers. All weights are equal.\n",
        "  style_layer_weight = 1.0 / num_style_layers\n",
        "\n",
        "  for target_style, generated_style in zip(\n",
        "      gram_style_features, style_output_features):\n",
        "    style_loss += style_layer_weight * compute_style_loss(\n",
        "        generated_style[0], target_style)\n",
        "    \n",
        "  # Accumulate content losses from all layers. All weights are equal.\n",
        "  content_layer_weight = 1.0 / num_content_layers\n",
        "  for target_content, generated_content in zip(\n",
        "      content_features, content_output_features):\n",
        "    content_loss += content_layer_weight * compute_content_loss(\n",
        "        generated_content[0], target_content)\n",
        "  \n",
        "  style_loss *= style_weight\n",
        "  content_loss *= content_weight\n",
        "\n",
        "  total_loss = style_loss + content_loss \n",
        "\n",
        "  return total_loss, style_loss, content_loss"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "fwzYeOqOUH9_"
      },
      "outputs": [],
      "source": [
        "def compute_gradients(cfg):\n",
        "  with tf.GradientTape() as tape: \n",
        "    all_loss = compute_loss(**cfg)\n",
        "\n",
        "  total_loss = all_loss[0]\n",
        "\n",
        "  return tape.gradient(total_loss, cfg['init_img']), all_loss"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "T9yKu2PLlBIE"
      },
      "source": [
        "### Optimization Loop"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "WyTG7HBzHBav"
      },
      "outputs": [],
      "source": [
        "import time\n",
        "import IPython\n",
        "from PIL import Image\n",
        "import IPython.display"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "qMoKS0paK8fu"
      },
      "source": [
        "We now combine all the functions above into this optimization loop. While this looks like a lot of code, a significant portion of it is dedicated to displaying generated images and reporting loss and time."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "pj_enNo6tACQ"
      },
      "outputs": [],
      "source": [
        "def run_style_transfer(content_path, style_path, n_iterations=1000,\n",
        "                       content_weight=1e4, style_weight=1e-4,\n",
        "                       display_iterations=True):\n",
        "  \"\"\"Run the neural style transfer algorithm.\n",
        "  \n",
        "  Arguments:\n",
        "    content_path: the filename of the target content image\n",
        "    style_path: the filename of the reference style image\n",
        "    content_weight: the weight for the content features, where higher means the\n",
        "      generated image will put heavier emphasis on content (default 1e-4)\n",
        "    style_weight: the weight for the style features, where higher means the\n",
        "      generated image put heavier emphasis on style (default 1e4)\n",
        "    n_iterations: the number of optimization iterations (default 1000)\n",
        "    display_iterations: whether to display intermediate iterations of the\n",
        "      generated images (default True)\n",
        "    \n",
        "  Returns:\n",
        "    The final generated image and the total loss for that image.\n",
        "  \"\"\"\n",
        "\n",
        "  model = get_model() \n",
        "  \n",
        "  # We don't need to (or want to) train any layers of our model, so we set their\n",
        "  # trainable to false. \n",
        "  for layer in model.layers:\n",
        "    layer.trainable = False\n",
        "  \n",
        "  style_features, content_features = feature_representations(\n",
        "      model, content_path, style_path)\n",
        "\n",
        "  gram_style_features = [\n",
        "      gram_matrix(style_feature) for style_feature in style_features\n",
        "  ]\n",
        "  \n",
        "  init_img = preprocess_img(content_path)\n",
        "  init_img = tfe.Variable(init_img, dtype=tf.float32)\n",
        "\n",
        "  # The optimizer params are somewhat arbitrary.\n",
        "  # See tensorflow.org/api_docs/python/tf/keras/optimizers/Adam#__init__\n",
        "  opt = tf.train.AdamOptimizer(learning_rate=5, beta1=0.99, epsilon=1e-1)\n",
        "  \n",
        "  # Store the result that minimizes loss as the best one.\n",
        "  best_loss, best_img = float('inf'), None\n",
        "  \n",
        "  # Create a nice config \n",
        "  loss_weights = (style_weight, content_weight)\n",
        "  cfg = {\n",
        "      'model':               model,\n",
        "      'loss_weights':        loss_weights,\n",
        "      'init_img':            init_img,\n",
        "      'gram_style_features': gram_style_features,\n",
        "      'content_features':    content_features\n",
        "  }\n",
        "\n",
        "  start_time = time.time()\n",
        "  global_start = time.time()\n",
        "  \n",
        "  norm_means = np.array([103.939, 116.779, 123.68])\n",
        "  min_vals = -norm_means\n",
        "  max_vals = 255 - norm_means   \n",
        "\n",
        "  imgs = []\n",
        "  for i in range(n_iterations):\n",
        "    \n",
        "    gradients, all_loss = compute_gradients(cfg)\n",
        "    total_loss, style_loss, content_loss = all_loss\n",
        "    opt.apply_gradients([(gradients, init_img)])\n",
        "    clipped = tf.clip_by_value(init_img, min_vals, max_vals)\n",
        "    init_img.assign(clipped)\n",
        "    end_time = time.time() \n",
        "    \n",
        "    # Update best loss and best image from total loss. \n",
        "    if total_loss \u003c best_loss:\n",
        "      best_loss = total_loss\n",
        "      best_img = deprocess_img(init_img.numpy())\n",
        "      \n",
        "    if display_iterations:\n",
        "      \n",
        "      n_rows, n_cols = 2, 5\n",
        "      display_interval = n_iterations / (n_rows * n_cols)\n",
        "  \n",
        "      if i % display_interval == 0:\n",
        "        start_time = time.time()\n",
        "\n",
        "        plot_img = deprocess_img(init_img.numpy())\n",
        "        imgs.append(plot_img)\n",
        "\n",
        "        IPython.display.clear_output(wait=True)\n",
        "        IPython.display.display_png(Image.fromarray(plot_img))\n",
        "\n",
        "        print('Iteration: {}'.format(i))        \n",
        "        print('Total loss: {:.4e}, ' \n",
        "              'style loss: {:.4e}, '\n",
        "              'content loss: {:.4e}, '\n",
        "              'time: {:.4f}s'.format(total_loss, style_loss, content_loss,\n",
        "                                     time.time() - start_time))\n",
        "\n",
        "  if display_iterations:\n",
        "    IPython.display.clear_output(wait=True)\n",
        "\n",
        "    plt.figure(figsize=(14,4))\n",
        "\n",
        "    for i,img in enumerate(imgs):\n",
        "        plt.subplot(n_rows, n_cols, i+1)\n",
        "        plt.imshow(img)\n",
        "        plt.axis('off')\n",
        "    \n",
        "    print('Total time: {:.4f}s'.format(time.time() - global_start))\n",
        "      \n",
        "  return best_img, best_loss"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "vSVMx4burydi"
      },
      "outputs": [],
      "source": [
        "best_img, best_loss = run_style_transfer('eminem.jpg', 'fractal.jpg')\n",
        "print(best_loss.numpy())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "LwiZfCW0AZwt"
      },
      "source": [
        "Let's visualize the final generated image."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "lqTQN1PjulV9"
      },
      "outputs": [],
      "source": [
        "plt.figure(figsize=(10, 10))\n",
        "\n",
        "plt.imshow(best_img)\n",
        "plt.axis('off')\n",
        "\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "NrCfAxrmxH_V"
      },
      "source": [
        "Let's display all three images side by side."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "-d-Wa9Q6v0xk"
      },
      "outputs": [],
      "source": [
        "plt.figure(figsize=(20, 60))\n",
        "\n",
        "plt.subplot(1, 3, 1)\n",
        "plt.imshow(load_img('eminem.jpg'))\n",
        "plt.axis('off')\n",
        "plt.title('Content Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.subplot(1, 3, 2)\n",
        "plt.imshow(load_img('fractal.jpg'))\n",
        "plt.axis('off')\n",
        "plt.title('Style Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.subplot(1, 3, 3)\n",
        "plt.imshow(best_img)\n",
        "plt.axis('off')\n",
        "plt.title('Generated Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "tyGMmWh2Pss8"
      },
      "source": [
        "## Another Example\n",
        "\n",
        "Now let's see what Times Square would look like when painted by Monet!\n",
        "\n",
        "\u003cimg src='https://github.com/raskutti/examples/blob/master/community/en/neural_style_transfer/times_square_monet.png?raw=1' alt='Drawing' style='width: 200px;'/\u003e\n",
        "original image by [Rafi B. from Somewhere in Texas :)](https://commons.wikimedia.org/wiki/File:Times_square_at_night.jpg)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "xlCpkbYzI8wC"
      },
      "outputs": [],
      "source": [
        "!wget --quiet -O 'times_square.jpg' https://upload.wikimedia.org/wikipedia/commons/9/9c/Times_square_at_night.jpg\n",
        "!wget --quiet -O 'water_lilies.jpg' https://upload.wikimedia.org/wikipedia/commons/5/5d/Monet_Water_Lilies_1916.jpg\n",
        "  \n",
        "!ls"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "ec-g6Ah1JywE"
      },
      "outputs": [],
      "source": [
        "best_img, _ = run_style_transfer('times_square.jpg', 'water_lilies.jpg',\n",
        "                                 display_iterations=False)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "Wi3c9mPxKs0P"
      },
      "outputs": [],
      "source": [
        "plt.figure(figsize=(20, 60))\n",
        "\n",
        "plt.subplot(1, 3, 1)\n",
        "plt.imshow(load_img('times_square.jpg'))\n",
        "plt.axis('off')\n",
        "plt.title('Content Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.subplot(1, 3, 2)\n",
        "plt.imshow(load_img('water_lilies.jpg'))\n",
        "plt.axis('off')\n",
        "plt.title('Style Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.subplot(1, 3, 3)\n",
        "plt.imshow(best_img)\n",
        "plt.axis('off')\n",
        "plt.title('Generated Image', fontdict = {'fontsize' : 40})\n",
        "\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "N_ZPAWDIbWBz"
      },
      "source": [
        "We can also tweak the `content_weight` and `style_weight` parameters of `run_style_transfer` to change the final generated image. The higher the `content_weight` parameter, the more content-heavy the generated image will be, and the higher the `style_weight` parameter, the more style-heavy the generated image will be. \n",
        "\n",
        "Note that increasing the `content_weight` will have a similar effect to decreasing the `style_weight`, and vice versa."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "TblM3S5ibvFj"
      },
      "outputs": [],
      "source": [
        "style_heavy_img, _ = run_style_transfer('times_square.jpg', 'water_lilies.jpg',\n",
        "                                        style_weight=1,\n",
        "                                        display_iterations=False)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "wNq9T0rjcAOD"
      },
      "outputs": [],
      "source": [
        "plt.imshow(style_heavy_img)\n",
        "plt.axis('off')\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "QIhZUMmOqmLi"
      },
      "outputs": [],
      "source": [
        "content_heavy_img, _ = run_style_transfer('times_square.jpg', 'water_lilies.jpg',\n",
        "                                          content_weight=1e8,\n",
        "                                          display_iterations=False)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "MH9KcWAPqy1X"
      },
      "outputs": [],
      "source": [
        "plt.imshow(content_heavy_img)\n",
        "plt.axis('off')\n",
        "plt.show()"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [],
      "name": "neural_style_transfer.ipynb",
      "private_outputs": true,
      "provenance": [],
      "toc_visible": true,
      "version": "0.3.2"
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
